Abstract

We study the entropy landscape of solutions for the bicoloring problem in random graphs, a 
representative difficult constraint satisfaction problem. Our goal is to classify which type of clusters 
of solutions are addressed by different algorithms. In the first part of the study we use the cavity 
method to obtain the number of clusters with a given internal entropy and determine the phase 
diagram of the problem, e.g. dynamical, rigidity and SAT-UNSAT transitions. In the second part 
of the paper we analyze different algorithms and locate their behavior in the entropy landscape 
of the problem. For instance we show that a smoothed version of a decimation strategy based on 
Belief Propagation is able to find solutions belonging to sub-dominant clusters even beyond the 
so called rigidity transition where the thermodynamically relevant clusters become frozen. These 
non-equilibrium solutions belong to the most probable unfrozen clusters. 
I. INTRODUCTION AND MOTIVATIONS

Many disciplines have at their root random Constraint Satisfaction Problems (CSPs). 
Examples are Information Theory where they are used to design error correcting codes 
or Computer Science where they constitute elementary models for studying the onset of 
exponential regimes in algorithms [sl. More in general, random CSPs capture some of the 
optimization aspects of complex systems found in physics (e.g. spin-glasses and packing 
problems), in economics (e.g. financial markets)!^, l5| and in biology (e.g. gene networks 
reconstruction and learning in neuroscience 0, Ej). A random CSP is characterized by 



an extensive list of constraints, each one forbidding some of the joint assignments of the 
(discrete) variables it involves. In packing problems for instance, overlapping positions of 
the elementary tiles on a given lattice are forbidden. Given an instance of a CSP, one wants to 
know whether there exists a solution, that is an assignment of the variables which satisfies all 
the constraints (e.g. a proper tiling or a proper coloring of a graph). When such assignment 
exists the instance is called SAT, and one wants to find it. Most of the interesting CSPs 
are NP-completelsl,^: in the worst case the number of operations needed to decide whether 
an instance is SAT or not is expected to grow exponentially with the number of variables. 
The interesting limit for random CSP is the thermodynamic one where both the number N 
of independent variables and the number M of constraints go to infinity at fixed constraint 
density $\alpha = M/N$. The most intriguing phenomenon is certainly the appearance of sharp
thresholds. At some critical ratio $\alpha_s$ the probability of existence of solutions jumps
from one to zero. Just below such a threshold, most of the known heuristic algorithms are
observed to undergo a dramatic slowing down. Such phenomenon has been put in connection 
with the onset of a clustering phase, where the space of solution becomes divided in a large 
(exponential) number of different clusters (or states) and variables develop non trivial long 
range correlations. 

The scope of this study is to go a step further in the statistical physics analysis of the
connection between clustering and the behavior of algorithms. By an analytic estimate of the
internal entropy of the clusters found by different algorithms on large problem instances,
and by a large deviation analysis of the distribution of clusters with respect to their internal
entropy, we are able to show which types of clusters are addressed by different algorithms.
For random CSPs in the clustering phase, we observe that local search algorithms may be
attracted by a large spectrum of clusters (not surprisingly, as this also happens in out-of-equilibrium
physical systems). Quite surprisingly we also show that there exist simple message-passing
(MP) processes that are capable of finding efficiently solutions even in the harder region 
where the thermodynamically dominant clusters become frozen. In such region local search 
algorithms are observed numerically to undergo an exponential slowing down due to the 
global rearrangements needed to correct the errors made along the search process. On the 
contrary, the MP processes that we study continue to find solution efficiently by addressing 
clusters which are more rare than the dominating ones (i.e. those which would be seen by 
sampling solutions with uniform measure over the solution space). These results, together 
with the evidence coming from fully connected CSP that even frozen solutions may be found 



by MP jo], shed new light on how MP algorithms can be utilized and suggest that some 
further algorithmic progress may be at hand. 

In what follows we first provide a very brief review of the known results and next apply 
our arguments to the so called Bicoloring problem, which has some analytical advantages 
compared to other NP-complete problems like K-SAT or Coloring while retaining all the 
conceptual features. 

The paper is organized as follows. In section II an intuitive introduction to clustering is
given. The definition of the specific problem under study is provided in section III together
with a summary of previously known results. The cavity method for large deviations is
presented in section IV. The numerical methods used to solve the cavity equations and
extract the complexity curves are described in section V. In section VI the equations are
solved in some special cases in order to obtain the main properties of the phase diagram of
the problem. The algorithms used to find solutions and locate them in the entropy landscape
are described in section VII. Section VIII summarizes the main results of the
paper, makes some concluding remarks and discusses perspectives. In the appendices
we report the details of the derivation of the cavity equations and some quantities that are essential
for computing the free energy.

II. GEOMETRY OF SOLUTIONS AND FREEZING 

The set of solutions of a random CSP should be thought of as a portion of the phase space 
which may undergo a fragmentation into clusters for values of the density of constraints right 
below the SAT threshold. This scenario can be made rigorous in a few cases, the simplest
one being the random XOR-SAT problem [l^. : the density of constraints were clustering 
appears corresponds to the percolation of particular structures in the underlying graph of 
constraints. This fact can be used to define clusters which, by linearity of XOR-SAT, turn 
out to be all identical. One may prove that a finite fraction of the variables have to be 
frozen, that is must take the same value in all the solutions belonging to a cluster. This 
picture is however far from general: both the definition of clusters and the analysis of their 
fluctuations in size are difficult tasks, which require the application of the cavity method in 
a rather advanced setting Q, Q, Q, [l8|. One important feature of clusters in CSPs 
concerns the presence or absence of frozen variables. It may happen that clusters with frozen 
variables coexist with totally unfrozen clusters of larger internal entropy, with big effects on 
the hardness of the associated combinatorial optimization problem. The intuitive reason why 
the presence of frozen variables is believed to be relevant is well summarized by the idea 
of rearrangements ISj: "given an initial solution of a CSP and a variable i that one would 
like to modify, a rearrangement is a path in configuration space that starts from the initial 
solution and leads to another solution where the value of the i-th variable is changed with 
respect to the initial one. The minimal length of such a path is a measure of how constrained 
was the variable i in the initial configuration. In intuitive terms this length diverges with 
the system size when the variable was frozen in the initial cluster" . The idea is that when 
freezing takes place in Gibbs states, then the rearrangements are responsible for a critical 
slowing down of local search algorithms. On the contrary, when dominant clusters are not 
frozen, even relatively simple local algorithms may find solutions by incrementally adding 
constraints until the full problem is satisfied. Recent arguments and numerical studies have
shown that one can still obtain a solution beyond the dynamical transition as long as the
so-called jamming transition has not occurred [19]. Following Ref. [19], one can imagine adding
the constraints one by one, in each step recording the number of flips that are required in
order to find the new solution. Close to the jamming transition the number of flips diverges
and makes it difficult to find a solution to the CSP [18].

A Constraint Satisfaction Problem is defined by a set of constraints $C = \{I_a(\sigma_{\partial a}) \in \{0, 1\} \mid a = 1, \ldots, M\}$
on a number of variables. The constraints depend on the configuration
of variables $\underline{\sigma} = \{\sigma_i \mid i = 1, \ldots, N\}$ and the problem is called satisfiable if all constraints are
satisfied, i.e. $I_a = 1, \forall a$. A solution of the problem is a configuration of the variables that
satisfies all constraints. In analogy with statistical physics models, we define the energy
$E[\underline{\sigma}]$ of configuration $\underline{\sigma}$ as the number of unsatisfied constraints in $\underline{\sigma}$. Given an instance of
a CSP, one is interested in deciding whether it is satisfiable (i.e. whether a configuration with $E[\underline{\sigma}] = 0$ exists) and, in such a
case, in explicitly finding solutions to the problem.

More generally, one can define an ensemble of instances of the problem by considering all
possible random assignments of the M constraints among variables, with fixed density of
constraints $\alpha = M/N$. Varying $\alpha$, it was shown that the system passes from a phase in which
it is always possible to find a solution, the SAT phase, to the UNSAT phase where a fraction
of the constraints are not satisfied. Examples of studies for root problems such as Random
Satisfiability, XOR-SAT and Graph Coloring can be found in [20, 21, 22, 23, 24, 25].
The main tool for analyzing the satisfiability of typical problem instances is the cavity
method, originally developed to study the thermodynamic properties of diluted spin-glass
systems [26] and recently reconsidered in the context of CSPs [22, 27, 28]. Actually, the
cavity method at the ensemble level allows one to study the typical properties as well as large
deviations from typical behaviors [29]. The key feature of the cavity method which
is of interest for computer science stems from the discovery that it can be converted to an
algorithm for analyzing single problem instances [33], becoming an efficient solver for
the random combinatorial problems.



III. DEFINITION OF THE PROBLEM AND KNOWN RESULTS 

Consider a hypergraph of N nodes $i = 1, \ldots, N$ and M hyperedges $a = 1, \ldots, M$. For
simplicity we consider regular random hypergraphs, or (K, L)-hypergraphs, where each
hyperedge connects K nodes and each node participates in $L = KM/N = K\alpha$ hyperedges. A
node in this hypergraph has state $\sigma_i \in \{0, 1\}$ and each hyperedge imposes a constraint on
the values of the associated variables. In the bicoloring problem this constraint just forbids
the configurations in which all the nodes belonging to a hyperedge have the same value. In
the context of circuit logic the bicoloring problem is known as the Not-All-Equal-Satisfiability
(NAE-SAT) problem; in physics it is a spin model with anti-ferromagnetic interactions.
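
As an illustration of the definitions above, the following minimal sketch samples a random (K, L)-hypergraph and counts the monochromatic hyperedges of a configuration. The stub-shuffling construction and the function names `random_regular_hypergraph` and `energy` are our own assumptions for illustration; they are not taken from the paper.

```python
import random

def random_regular_hypergraph(n, k, l, seed=0, max_tries=1000):
    """Sample M = n*l/k hyperedges of size k so that every node has degree l.

    Assumed construction: shuffle n*l node "stubs" and cut them into groups
    of k, retrying until no hyperedge contains the same node twice.
    """
    if (n * l) % k != 0:
        raise ValueError("N * L must be divisible by K")
    rng = random.Random(seed)
    stubs = [i for i in range(n) for _ in range(l)]
    for _ in range(max_tries):
        rng.shuffle(stubs)
        edges = [tuple(stubs[j:j + k]) for j in range(0, len(stubs), k)]
        if all(len(set(e)) == k for e in edges):
            return edges
    raise RuntimeError("no valid hypergraph found; increase max_tries")

def energy(edges, sigma):
    """Number of violated constraints, i.e. monochromatic hyperedges."""
    return sum(1 for e in edges if len({sigma[i] for i in e}) == 1)
```

A configuration `sigma` is a solution of the bicoloring problem exactly when `energy(edges, sigma) == 0`.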

We may represent bicoloring as a factor graph [30]. This is a bipartite graph where the
variables and constraints are represented with two different types of nodes, variable and
function nodes, respectively. Each function node is connected to all the variable nodes that
should satisfy the associated constraint. Figure 1 shows an example of a regular random factor
graph.

FIG. 1: A regular random factor graph with function nodes (squares) of degree K = 3 and variable
nodes (circles) of degree L = 2.

The hypergraph bicoloring problem is NP-complete for K > 3 [si^ . The case K = 3 with 
a Poisson degree distribution for the variable nodes has already been studied in [32]. The
authors found dynamical and SAT/UNSAT transitions within the single and multiple cluster
approximations. In spin glass language these approximations are called the replica symmetric
(RS) and one-step replica symmetry breaking (1RSB) approximations, respectively. In the
latter case the authors only consider the most numerous clusters.

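The intensive entropy $s$ is defined through the number of solutions $\mathcal{N} = e^{Ns}$. Using the Bethe
approximation in the replica symmetric phase we find the entropy as

$$ s^{RS} = \ln\left[2\left(1 - \frac{1}{2^{K-1}}\right)^{L}\right] - (K-1)\alpha \ln\left[1 - \frac{1}{2^{K-1}}\right]. \qquad (1) $$

This quantity vanishes at

$$ L_s^{RS} = -\frac{K \ln 2}{\ln\left(1 - \frac{1}{2^{K-1}}\right)}, \qquad (2) $$

which for $K \gg 1$ gives

$$ L_s^{RS} \simeq K 2^{K-1} \ln 2 \left(1 + O\left(\frac{1}{2^{K}}\right)\right). \qquad (3) $$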
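
The RS threshold can be checked numerically. The sketch below assumes that the RS entropy has the form $s^{RS} = \ln[2(1 - 2^{1-K})^L] - (K-1)\alpha \ln[1 - 2^{1-K}]$ with $\alpha = L/K$, and returns the smallest integer degree at which it is no longer positive. The function names are ours.

```python
import math

def rs_entropy(k, l):
    """RS (Bethe) entropy per variable of a (K, L)-hypergraph (assumed form)."""
    alpha = l / k
    log_term = math.log(1.0 - 2.0 ** (1 - k))
    return math.log(2.0) + l * log_term - (k - 1) * alpha * log_term

def rs_threshold_degree(k):
    """Smallest integer degree L with a non-positive RS entropy."""
    return math.ceil(-k * math.log(2.0) / math.log(1.0 - 2.0 ** (1 - k)))
```

For K = 9 and K = 10 this gives 1594 and 3546, the RS values of $L_s$ that appear in Table I.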

If there exists more than one cluster of solutions we define the complexity $\Sigma$ by $\mathcal{N}_c = e^{N\Sigma}$,
where $\mathcal{N}_c$ is the number of clusters. Notice that for very large N the above complexity is
dominated by typical clusters. In the 1RSB approximation and considering only typical clusters,
the complexity reads [32]

$$ \Sigma_{typ} = \ln[A_L] - (K-1)\alpha \ln\left[1 - \frac{1}{2}(1-\eta)\left(1 - \frac{\eta^{L-1}}{A_{L-1}}\right)\right], \qquad (4) $$

where

$$ A_l = 2\left(1 - \frac{1-\eta}{2}\right)^{l} - \eta^{l}, \qquad (5) $$

and $\eta$ is determined by the following equation

$$ \eta = 1 - 2\left[\frac{1}{2}\left(1 - \frac{\eta^{L-1}}{A_{L-1}}\right)\right]^{K-1}. \qquad (6) $$

A nontrivial solution ($\eta \neq 1$) of the above equation results in a nonzero complexity. Let us
assume $K \gg 1$ and find the point where a nontrivial solution appears for the first time. We
try $\eta = 1 - c/2^{K-1}$ and find $c$ in a self-consistent way. From the above equation we obtain

$$ \frac{c}{2} \simeq \exp\left[-\frac{K-1}{2}\, e^{-cL/2^{K}}\right]. \qquad (7) $$

It means that to have a finite solution for $c$ we need L to diverge as

$$ L \simeq \frac{2^{K}}{c}\left[\ln K - \ln 2 - \ln\ln\left(\frac{2}{c}\right) + o(1)\right]. \qquad (8) $$

At the SAT/UNSAT transition $\Sigma_{typ}$ vanishes and we can use this fact to determine $L_s$. This
value behaves, asymptotically, as $L_s^{RS}$ in Eq. 3. In Table I we compare the numerical values
of the degree at which a nontrivial solution first appears and of $L_s$ obtained with the above methods.

TABLE I: Numerical values of the degree at which a nontrivial solution first appears and of $L_s$
(in the RS and 1RSB approximations). In each case we
have given the smallest integer degree larger than or equal to the precise value.

  K     L (nontrivial solution)   L_s (RS)   L_s (1RSB)
  ...   428
  9     905                       1594       1592
  10    1894                      3546       3543



IV. CAVITY METHOD: A LARGE DEVIATION STUDY 



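A more complete picture of the distribution of clusters is given by a large deviation study
[29]. We define the partition function at zero temperature as

$$ Z = \sum_{\underline{\sigma}} \prod_a I_a(\sigma_{\partial a})\, e^{x \sum_i (\sigma_i - \sigma_i^*)^2}. \qquad (9) $$

Here $x$ is a Lagrange multiplier that controls the distance between the solutions and a
reference point $\underline{\sigma}^*$, and $\sigma_{\partial a} = \{\sigma_i \mid i \in V(a)\}$, where $V(a)$ is the set of neighbors of function
node $a$. For $x = 0$ we recover the total number of solutions. If there is only one cluster of
solutions we can safely use the standard Belief Propagation (BP) equations [30] to obtain
an estimate of the cavity marginals (see appendix A)

$$ \mu_{i \to a}(\sigma_i) = \frac{1}{Z_{i \to a}} \sum_{\sigma_{\partial i \to a}} \prod_{b \in V(i) \setminus a} \Big[ I_b(\sigma_{\partial b}) \prod_{j \in V(b) \setminus i} \mu_{j \to b}(\sigma_j) \Big] e^{x (\sigma_i - \sigma_i^*)^2}. \qquad (10) $$

Here $Z_{i \to a}$ is a normalization constant, $V(i)$ is the set of neighbors of variable node $i$ and
$\sigma_{\partial i \to a} = \{\sigma_j \mid j \in V(b) \setminus i,\ b \in V(i) \setminus a\}$. We will write the above equation in short as

$$ \mu_{i \to a}(\sigma_i) = \mathcal{B}[\mu_{j \to b}]. \qquad (11) $$

We also define the free energy $f(x)$ as

$$ e^{N f(x)} = \sum_d e^{N[s(d) + x d]}, \qquad (12) $$

where $d = \frac{1}{N}\sum_i (\sigma_i - \sigma_i^*)^2$ and $e^{N s(d)}$ is the number of solutions at distance $d$ from the
reference point. In the Bethe approximation

$$ N f(x) = \sum_i \Delta f_i - \sum_a (K_a - 1) \Delta f_a, \qquad (13) $$

where

$$ e^{\Delta f_i} = \sum_{\sigma_i} \Big[ \prod_{a \in V(i)} \mu_{a \to i}(\sigma_i) \Big] e^{x(\sigma_i - \sigma_i^*)^2}, \qquad e^{\Delta f_a} = \sum_{\sigma_{\partial a}} I_a(\sigma_{\partial a}) \prod_{i \in V(a)} \mu_{i \to a}(\sigma_i). \qquad (14) $$

Using the above free energy we can determine the entropy $s(d)$ by a Legendre transform

$$ s(d) = \min_x [f(x) - x\, d(x)], \qquad d(x) = \frac{\partial f}{\partial x}. \qquad (15) $$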
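
The Bethe free energy at $x = 0$ can be checked on a small tree-like instance, where BP is exact and the total Bethe entropy equals the logarithm of the number of solutions. The sketch below is a minimal illustration under our own assumptions: the function-to-variable message is taken as the unnormalized sum $\mu_{a \to i}(\sigma_i) = \sum I_a \prod_{j \in V(a) \setminus i} \mu_{j \to a}(\sigma_j)$, every hyperedge has at least two nodes, and the names `count_solutions` and `bethe_entropy` are ours.

```python
import math
import random
from itertools import product

def count_solutions(n, edges):
    """Brute-force number of bicolorings (only feasible for tiny n)."""
    return sum(
        all(len({s[i] for i in e}) > 1 for e in edges)
        for s in product((0, 1), repeat=n)
    )

def bethe_entropy(n, edges, sweeps=100, seed=0):
    """Total Bethe entropy at x = 0 from BP on a bicoloring factor graph."""
    rng = random.Random(seed)
    nbrs = [[] for _ in range(n)]
    for a, e in enumerate(edges):
        for i in e:
            nbrs[i].append(a)
    # mu[(i, a)] stores mu_{i->a}(sigma_i = 0); mu_{i->a}(1) = 1 - mu[(i, a)]
    mu = {(i, a): rng.uniform(0.1, 0.9) for a, e in enumerate(edges) for i in e}

    def from_function(a, i):
        # Unnormalized mu_{a->i}(sigma_i): the only forbidden case is that
        # all other variables of a take the same value as sigma_i.
        p0 = math.prod(mu[(j, a)] for j in edges[a] if j != i)
        p1 = math.prod(1.0 - mu[(j, a)] for j in edges[a] if j != i)
        return 1.0 - p0, 1.0 - p1

    def variable_weights(i, skip=None):
        w0 = w1 = 1.0
        for b in nbrs[i]:
            if b != skip:
                m0, m1 = from_function(b, i)
                w0, w1 = w0 * m0, w1 * m1
        return w0, w1

    for _ in range(sweeps):
        for (i, a) in mu:
            w0, w1 = variable_weights(i, skip=a)
            mu[(i, a)] = w0 / (w0 + w1)

    total = 0.0
    for i in range(n):
        w0, w1 = variable_weights(i)
        total += math.log(w0 + w1)                       # Delta f_i at x = 0
    for a, e in enumerate(edges):
        p0 = math.prod(mu[(i, a)] for i in e)
        p1 = math.prod(1.0 - mu[(i, a)] for i in e)
        total -= (len(e) - 1) * math.log(1.0 - p0 - p1)  # Delta f_a
    return total
```

On the two-hyperedge tree `[(0, 1, 2), (2, 3, 4)]` both routines agree: there are 18 solutions and the Bethe entropy equals ln 18.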
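If the replica symmetry is broken, we would have different clusters of solutions and the
cavity marginals fluctuate from one cluster to another. This is described by a generalized
partition function defined by

$$ \mathcal{Z}(m) = e^{N \mathcal{F}(m)} = \sum_c e^{m N s_c}. \qquad (16) $$

Here $c$ labels the clusters and $s_c$ is the internal entropy of cluster $c$. Again in the Bethe
approximation

$$ N \mathcal{F}(m) = \sum_i \ln Z_i - \sum_a (K_a - 1) \ln Z_a, \qquad (17) $$

where

$$ Z_i = \int \prod_{a \in V(i)} \prod_{j \in V(a) \setminus i} dP_{j \to a}[\mu_{j \to a}]\, e^{m \Delta s_i}, \qquad Z_a = \int \prod_{i \in V(a)} dP_{i \to a}[\mu_{i \to a}]\, e^{m \Delta s_a}. \qquad (18) $$

Having the generalized free energy we can determine the complexity $\Sigma(s)$ by a Legendre
transformation

$$ \Sigma(s) = \min_m [\mathcal{F}(m) - m s], \qquad s(m) = \frac{\partial \mathcal{F}}{\partial m}. \qquad (19) $$

In appendix B we explain the origin of the above relations. Notice that $\Delta s_i = \Delta f_i(x = 0)$
and $\Delta s_a = \Delta f_a(x = 0)$, where the free energy shifts are given by Eq. 14.
Moreover, $P_{i \to a}[\mu_{i \to a}]$ is the probability that, in a randomly selected cluster, we find the
cavity marginal $\mu_{i \to a}$ on edge $(i, a)$ of the factor graph. This probability distribution is
determined by the following self-consistency equation:

$$ P_{i \to a}[\mu_{i \to a}] = \frac{1}{Z_{i \to a}} \int \prod_{b \in V(i) \setminus a} \prod_{j \in V(b) \setminus i} dP_{j \to b}[\mu_{j \to b}]\, e^{m \Delta s_{i \to a}}\, \delta(\mu_{i \to a} - \mathcal{B}[\mu]), \qquad (20) $$

where $Z_{i \to a}$ is a normalization constant and $\mathcal{B}[\mu]$ is the same as in Eq. 11 with $x = 0$.
The factor $e^{m \Delta s_{i \to a}}$ is needed to sample the clusters correctly when we add the new variable $i$. Let
us multiply the two sides of Eq. 20 by $2\mu_{i \to a}(\sigma)$ to find the new probability distribution
$Q_{i \to a}[\mu] \equiv 2\mu_{i \to a}(\sigma) P_{i \to a}[\mu]$, which will be useful in the special case of $m = 1$. On the right
hand side we can replace $\mu_{i \to a}(\sigma)$ with its definition in Eq. 10. Rearranging the terms we
get

$$ Q_{i \to a}[\mu_{i \to a}] = \frac{2}{Z_{i \to a}} \int \sum_{\sigma_{\partial i \to a}} \prod_{b \in V(i) \setminus a} I_b(\sigma_{\partial b}) \prod_{j \in V(b) \setminus i} \Big(\frac{1}{2}\, dQ_{j \to b}[\mu_{j \to b}]\Big)\, e^{(m-1)\Delta s_{i \to a}}\, \delta(\mu_{i \to a} - \mathcal{B}[\mu]). \qquad (21) $$
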



FIG. 2: Population dynamics works with a population of fields on each link of the factor graph. 

In the following we will split $P_{i \to a}[\mu_{i \to a}]$ into frozen and unfrozen parts as

$$ P_{i \to a}[\mu_{i \to a}] = \frac{1 - \pi}{2}[\delta(r) + \delta(r - 1)] + \pi \rho(r), \qquad (22) $$

where $\mu_{i \to a}(0) = r$, $\mu_{i \to a}(1) = 1 - r$ and $\rho(r)$ is the probability distribution of unfrozen
fields. The above arguments become much simpler in random (K, L)-hypergraphs where all
the links and nodes are statistically equivalent. For example, Eq. 17 is replaced with

$$ \mathcal{F}(m) = \ln\Big[\int D_i P[\mu]\, e^{m \Delta s_i}\Big] - \alpha (K-1) \ln\Big[\int D_a P[\mu]\, e^{m \Delta s_a}\Big], \qquad (23) $$

where $\int D_i P[\mu]$ and $\int D_a P[\mu]$ denote the integrations over all the cavity marginals that
contribute to $\Delta s_i$ and $\Delta s_a$, respectively.



V. ENTROPY LANDSCAPE: NUMERICAL METHOD 



The main equation that we should solve is Eq. 20. One can use the population dynamics
method [27, 28] to avoid summing over a large number of continuous variables.



A. In a single hypergraph 

Given the factor graph we represent $P_{i \to a}[\mu]$ by a population of $N_p$ cavity probabilities
(or fields), Fig. 2. At the initial point all the cavity fields are of frozen kind, with equal
probability for $r = 0$ and $r = 1$. With this initial condition we will not miss a nontrivial
solution with frozen fields, if any. At each step of the population dynamics we select a link
$(i \to a)$ randomly and proceed as follows:

• For each $b \in V(i) \setminus a$ and $j \in V(b) \setminus i$: randomly select a member of the population
on link $(j \to b)$.

• Using these $(L - 1)(K - 1)$ fields: calculate the new $\mu_{i \to a}$ by Eq. 10.

• Calculate the weight $w_{i \to a} = e^{m \Delta s_{i \to a}}$ from Eq. 13 at $x = 0$. This weight is zero if there
is any contradiction.

• With probability $w_{i \to a} / w_{i \to a}^{max}$ replace a randomly selected member of the population with the
new one. Here $w_{i \to a}^{max}$ is the maximum weight $w_{i \to a}$ observed in the evolution from the
beginning.
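
The update steps above can be sketched as follows. For brevity the sketch keeps a single shared population instead of one population per link, and it takes $e^{\Delta s_{i \to a}}$ to be the normalization of the new cavity marginal for the not-all-equal constraint. Both choices, as well as the function names, are our assumptions and not the authors' implementation.

```python
import random

def new_field(fields, k, l, rng):
    """Draw (L-1)(K-1) fields and return (r_new, exp(Delta s)) for a link.

    Each field r is mu(sigma = 0). A contradiction gives a zero weight.
    """
    w0 = w1 = 1.0
    for _ in range(l - 1):
        p0 = p1 = 1.0
        for _ in range(k - 1):
            r = rng.choice(fields)
            p0 *= r
            p1 *= 1.0 - r
        w0 *= 1.0 - p0   # constraint forbids all neighbors equal to 0 with sigma_i = 0
        w1 *= 1.0 - p1   # and all neighbors equal to 1 with sigma_i = 1
    z = w0 + w1
    if z == 0.0:
        return None, 0.0
    return w0 / z, z

def population_sweep(fields, k, l, m, w_max, rng):
    """One sweep of weighted replacements; returns the updated maximum weight."""
    for _ in range(len(fields)):
        r_new, z = new_field(fields, k, l, rng)
        w = z ** m if z > 0.0 else 0.0
        w_max = max(w_max, w)
        if w_max > 0.0 and rng.random() < w / w_max:
            fields[rng.randrange(len(fields))] = r_new
    return w_max
```

Starting from `fields = [rng.choice((0.0, 1.0)) for _ in range(n_p)]` reproduces the fully frozen initial condition described in the text; repeated calls to `population_sweep` then evolve the population.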

In a sweep of the algorithm we choose all the links of the factor graph randomly. Having
the populations we can obtain the free energy as

$$ N \mathcal{F}(m) = \sum_i \ln Z_i - (K-1) \sum_a \ln Z_a, \qquad Z_i = \langle e^{m \Delta s_i} \rangle_{pop}, \qquad Z_a = \langle e^{m \Delta s_a} \rangle_{pop}, \qquad (24) $$

where $\langle \cdot \rangle_{pop}$ means averaging over the populations. We stop the updates as soon as the free
energy, and so the weights $w_{i \to a}^{max}$, reach the steady state. Then the entropy reads

$$ N s(m) = \sum_i \overline{\Delta s_i} - (K-1) \sum_a \overline{\Delta s_a}, \qquad \overline{\Delta s_i} = \frac{\langle \Delta s_i\, e^{m \Delta s_i} \rangle_{pop}}{\langle e^{m \Delta s_i} \rangle_{pop}}, \qquad \overline{\Delta s_a} = \frac{\langle \Delta s_a\, e^{m \Delta s_a} \rangle_{pop}}{\langle e^{m \Delta s_a} \rangle_{pop}}. \qquad (25) $$

Figure 3 shows the results for choices of the factor graph parameters that correspond to
different phases.

FIG. 3: Complexity vs entropy for K = 4, L = 18, 19 with $N_p$ = 10^ (top) and K = 6, L = 121
with $N_p$ = 2 x 10^ (bottom) in the one-link approximation. Filled and empty symbols represent frozen
and unfrozen parts, respectively. The statistical errors are about 0.001.

B. In one-link approximation 



In a regular random hypergraph all links of the associated factor graph are equivalent.
Therefore we can forget about different populations on different links and work with only one
large population of fields. The way we obtain the stationary distribution $P[\mu]$ is the same as
above. The only difference is that we always select the fields from the single population. In
Fig. 4 we compare the complexity computed on a single (4, 19)-hypergraph of size N = 10^
with the complexity obtained in the one-link approximation.

FIG. 4: Comparing $\Sigma(s)$ in a single (4, 19)-hypergraph (N = 10^, $N_p$ = 10^) and in the one-link
approximation with $N_p$ = 10^. The statistical errors are about 0.001.

VI. ENTROPY LANDSCAPE: ANALYTICAL RESULTS 

To locate different phase transitions in the solution space we need to calculate the
generalized free energy $\mathcal{F}$, which is given in terms of $Z_i$ and $Z_a$ in Eq. 18. These quantities in
turn depend on $\pi$, which fixes the fraction of frozen variables, and on $\rho(r)$, which should be determined by
Eq. 20 for $P[\mu]$. In the following we study some special cases that allow us to calculate the
above quantities and determine the phase diagram of the problem. For clarity here we only
state the results of calculations that will be presented in more detail in appendix C.
A. The case m = 1

The study of $m = 1$ clusters is relevant in determining the dynamical, rigidity and
condensation transitions [16]. These are in fact the thermodynamically relevant clusters
before the condensation transition.

For $m = 1$ the generalized free energy reads

$$ \mathcal{F} = \ln\Big(2\Big[1 - \frac{1}{2^{K-1}}\Big]^{L}\Big) - (K-1)\alpha \ln\Big(1 - \frac{1}{2^{K-1}}\Big). \qquad (26) $$

Comparing with the RS entropy $s^{RS}$ in Eq. 1 we see that $\mathcal{F}(m = 1) = \Sigma(m = 1) + s(m = 1) = s^{RS}$.
Therefore, as long as the $m = 1$ clusters are the thermodynamically relevant ones
the RS approximation gives the correct total entropy. From Eq. 21 we can also find the
probability of having a frozen field with $r = 1$ (Eq. 27).
For small L this equation has only one solution, $\pi = 1$, where the $m = 1$ clusters
are unfrozen. Increasing L, one reaches the rigidity point $L_r$ where another solution $\pi \neq 1$
appears. This is where a finite fraction of the variables in these clusters become frozen. We
find that for $K < 6$ the rigidity transition always happens after the SAT/UNSAT transition.
Simplifying Eq. 27 we obtain

$$ \pi = \Big[1 - \frac{(1 - \pi)^{K-1}}{2^{K-1} - 1}\Big]^{L-1}. \qquad (28) $$

Assuming $K \gg 1$ and $\pi = c/K$ one obtains

$$ \frac{c}{K} \simeq \exp\Big[-\frac{L}{2^{K-1}}\, e^{-c}\Big], \qquad (29) $$

which suggests

$$ L_r \simeq 2^{K-1} e^{c} [\ln K - \ln c + o(1)]. \qquad (30) $$

We see that, like the degree in Eq. 8, the leading term in $L_r$ diverges as $2^K \ln K$. Compare it with the leading
term of $L_s^{RS}$, which scales as $2^K K$.
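
The rigidity point can be located numerically. The sketch below assumes that the equation for the unfrozen fraction at $m = 1$ has the form $\pi = [1 - (1 - \pi)^{K-1}/(2^{K-1} - 1)]^{L-1}$ and searches for the smallest integer degree at which a solution with $\pi < 1$ exists. The grid search and the function names are our own choices.

```python
def has_frozen_solution(k, l, grid=20000):
    """True if the assumed fixed-point equation has a solution with pi < 1."""
    for n in range(1, grid):
        pi = n / grid
        rhs = (1.0 - (1.0 - pi) ** (k - 1) / (2 ** (k - 1) - 1)) ** (l - 1)
        if rhs <= pi:   # the right hand side crosses the diagonal
            return True
    return False

def rigidity_degree(k, l_max=10000):
    """Smallest integer degree L at which a nontrivial solution appears."""
    for l in range(2, l_max):
        if has_frozen_solution(k, l):
            return l
    return None
```

For K = 6 this search returns 119, which agrees with one of the K = 6 entries of Table II.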



B. The case m = 0

The typical or most numerous clusters are the $m = 0$ ones. The study of these clusters
provides us with an estimate of the SAT/UNSAT transition (in that they are the last clusters
to disappear). Indeed the previous studies of the complexity in the 1RSB phase focus on this
type of clusters. For $m = 0$ the generalized free energy reads

$$ \mathcal{F} = \ln\Big(2\Big[1 - \Big(\frac{1-\pi}{2}\Big)^{K-1}\Big]^{L} - \Big[1 - 2\Big(\frac{1-\pi}{2}\Big)^{K-1}\Big]^{L}\Big) - (K-1)\alpha \ln\Big(1 - 2\Big(\frac{1-\pi}{2}\Big)^{K}\Big). \qquad (31) $$

Using Eq. 20 one can easily write an equation for the fraction of frozen marginals

$$ \frac{1-\pi}{2} = \frac{\big[1 - \big(\frac{1-\pi}{2}\big)^{K-1}\big]^{L-1} - \big[1 - 2\big(\frac{1-\pi}{2}\big)^{K-1}\big]^{L-1}}{2\big[1 - \big(\frac{1-\pi}{2}\big)^{K-1}\big]^{L-1} - \big[1 - 2\big(\frac{1-\pi}{2}\big)^{K-1}\big]^{L-1}}. \qquad (32) $$

The above equation is another way of writing Eq. 6 that has been obtained in the previous
studies. Notice that

$$ \pi = \frac{\eta^{L-1}}{A_{L-1}}, \qquad (33) $$

with $A_{L-1}$ and $\eta$ given by Eqs. 5 and 6. A nontrivial solution for $\pi$ appears at the point where, for
the first time, a maximum appears in the curve $\Sigma(s)$. According to Eq. 19 the complexity
of $m = 0$ clusters is $\Sigma(m = 0) = \mathcal{F}$. The point where this quantity vanishes defines the
SAT/UNSAT transition $L_s$. One can show that, like $L_s^{RS}$, the leading term in $L_s$ scales as $2^K K$.

C. The case π = 0

This case is relevant to study very small clusters, or the region close to the SAT/UNSAT transition
where almost all variables are frozen and $\pi \simeq 0$. Notice that solving numerically for $\Sigma(s)$ is
a heavy computational job and it would be useful to have other approximation methods to
get a good estimate of the complexity. When $\pi = 0$ the generalized free energy is given by

$$ \mathcal{F} = \ln\Big[2(2^{m-1} - 1)\Big[1 - \frac{1}{2^{K-2}}\Big]^{L} + 2\Big[1 - \frac{1}{2^{K-1}}\Big]^{L}\Big] - (K-1)\alpha \ln\Big[1 - \frac{1}{2^{K-1}}\Big]. \qquad (34) $$

Taking derivatives we obtain the entropy as

$$ s = \frac{2^{m} \ln 2\, \big[1 - \frac{1}{2^{K-2}}\big]^{L}}{2(2^{m-1} - 1)\big[1 - \frac{1}{2^{K-2}}\big]^{L} + 2\big[1 - \frac{1}{2^{K-1}}\big]^{L}}. \qquad (35) $$

With the above quantities we can obtain the complexity of different clusters. In Fig. 5 we
have compared this complexity with the one obtained numerically in the one-link
approximation. As the figure shows, the agreement is good, especially for the smaller and frozen
clusters.

FIG. 5: Comparing $\Sigma(s)$ for K = 6 and L = 121 (in the one-link approximation with $N_p$ = 2 x 10^)
with the complexity obtained for $\pi = 0$.

Close to the condensation transition the $m = 1$ clusters are nearly completely frozen and
we expect the $\pi = 0$ complexity to give a good estimate of $\Sigma(m = 1)$. From the above
equations we obtain

$$ \Sigma(m = 1) \simeq \ln\Big(2\Big[1 - \frac{1}{2^{K-1}}\Big]^{L}\Big) - (K-1)\alpha \ln\Big(1 - \frac{1}{2^{K-1}}\Big) - \Big(\frac{1 - \frac{1}{2^{K-2}}}{1 - \frac{1}{2^{K-1}}}\Big)^{L} \ln 2. \qquad (36) $$

We use this approximated complexity to determine the condensation transition, $L_c$, where
$\Sigma(m = 1)$ vanishes. After some algebra we find that for $K \gg 1$

$$ L_c \simeq K 2^{K-1} (\ln 2 - \ldots + o(1)). \qquad (37) $$

Notice that the leading term in $L_c$ is exactly the same as in $L_s^{RS}$.
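
The closed-form expressions for fully frozen clusters are easy to evaluate. The sketch below assumes the free energy $\mathcal{F}(m) = \ln[2(2^{m-1} - 1)(1 - 2^{2-K})^L + 2(1 - 2^{1-K})^L] - (K-1)\alpha \ln[1 - 2^{1-K}]$, obtains the entropy as its derivative with respect to $m$, and the complexity as $\Sigma = \mathcal{F} - m s$. The function names are ours.

```python
import math

def free_energy_pi0(m, k, l):
    """Generalized free energy for fully frozen fields (assumed form)."""
    a = (1.0 - 2.0 ** (2 - k)) ** l
    b = (1.0 - 2.0 ** (1 - k)) ** l
    alpha = l / k
    return (math.log(2.0 * (2.0 ** (m - 1) - 1.0) * a + 2.0 * b)
            - (k - 1) * alpha * math.log(1.0 - 2.0 ** (1 - k)))

def entropy_pi0(m, k, l):
    """Internal entropy s(m) = dF/dm."""
    a = (1.0 - 2.0 ** (2 - k)) ** l
    b = (1.0 - 2.0 ** (1 - k)) ** l
    return 2.0 ** m * math.log(2.0) * a / (2.0 * (2.0 ** (m - 1) - 1.0) * a + 2.0 * b)

def complexity_pi0(m, k, l):
    """Complexity from the Legendre transform, Sigma = F - m * s."""
    return free_energy_pi0(m, k, l) - m * entropy_pi0(m, k, l)
```

Scanning $m$ and plotting `complexity_pi0` against `entropy_pi0` for K = 6 and L = 121 gives a parametric curve of the kind compared with the numerical data in Fig. 5.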



D. The case π = 1

The complexity can be nonzero even when the frozen fields are absent. In this case we
can exactly compute the free energies of $m = 0, 1, 2$ clusters. We use this fact to find an
approximated free energy for $m \in [0, 2]$. Having $\mathcal{F}(m = 0)$, $\mathcal{F}(m = 1)$ and $\mathcal{F}(m = 2)$ we
use the Lagrange interpolating polynomial to write $\mathcal{F}(m)$ around $m = 1$:

$$ \mathcal{F}(m) = -\mathcal{F}(m = 1)\, m(m - 2) + \mathcal{F}(m = 2)\, \frac{m(m-1)}{2}, \qquad (38) $$

where we have used the fact that for $\pi = 1$, $\mathcal{F}(m = 0) = 0$. The resulting entropy and
complexity are

$$ s(m) = -2(m - 1)\mathcal{F}(m = 1) + \Big(m - \frac{1}{2}\Big)\mathcal{F}(m = 2), \qquad \Sigma(m) = m^2 \Big[\mathcal{F}(m = 1) - \frac{1}{2}\mathcal{F}(m = 2)\Big]. \qquad (39) $$

As we show in appendix C, the free energy $\mathcal{F}(m = 2)$ depends on the second moment $\langle r^2 \rangle$ of the
distribution $\rho(r)$, which is determined by Eq. 40.
It turns out that $\Sigma(m = 1)$ is zero as long as $\langle r^2 \rangle = 1/4$, i.e. $\rho(r) = \delta(r - \frac{1}{2})$. The complexity
becomes nonzero only when Eq. 40 admits a nontrivial solution. We can rewrite Eq. 40
in terms of a single variable $x$ (Eq. 41).
Taking $K \gg 1$ and a suitably rescaled $x$ we find

$$ \frac{c}{K} \simeq \exp\Big[-\frac{L}{2^{K-1}}\, e^{-c}\Big]. \qquad (42) $$

The equation suggests that

$$ L_d \simeq 2^{K-1} e^{c} [\ln K - \ln c + o(1)], \qquad (43) $$

which behaves very similarly to $L_r$ in Eq. 30.

E. The case of integer m 

Suppose that we have computed $Z_i(m_n)$ and $Z_a(m_n)$ for $m_n = 0, 1, \ldots, N_m - 1$. In
appendix C we write explicit relations for these quantities when $\rho(r) = \delta(r - \frac{1}{2})$. We can
find an approximated free energy that interpolates between the free energy values at integer
m's, $\mathcal{F}(m_n)$. To this end we use the Lagrange interpolating polynomial

$$ \mathcal{F}(m) = \sum_{n=0}^{N_m - 1} \mathcal{F}(m_n) \prod_{l \neq n} \frac{m - m_l}{m_n - m_l}. \qquad (44) $$

To obtain the free energy we also need to determine $\pi$ from Eq. 21. This equation depends
on $Z_{i \to a}(m)$, which again can be obtained in the above interpolation approximation. Using
the above approximation we can obtain $\pi$ and the free energy as long as $0 < m < m_f$. Here
$m_f$ is the maximum value of $m$ such that frozen variables do exist. Indeed for $m > m_f$ the
fraction of frozen variables is zero and with a trivial $\rho(r)$ we find a zero complexity, which
is not always correct. The number of interpolation points is chosen such that the resulting
complexity has a reasonable behavior. In Fig. 6 we compare the complexity obtained in
this way with the one we obtained by the population dynamics. As the figure shows, with
the interpolation approximation we are able to reproduce the population dynamics results
in the interval $0 < m < m_f$. With the above approximation we can find an estimate of the
freezing point, $L_f$, where all clusters become frozen. In Table II we compare $L_f$ with the
degree values at the other phase transition points.

FIG. 6: Comparing $\Sigma(s)$ for K = 6 and L = 121 (in the one-link approximation with $N_p$ = 2 x 10^)
with the complexity that has been obtained by the interpolation approximation ($N_m$ = 10).

TABLE II: Numerical values of the degree L at different transition points obtained in the 1RSB
approximation with the methods described in the manuscript.

  K     L_d    L_r    L_f    L_c    L_s
  ...   20     20
  ...   49     53     53     53
  6     114    119    126    130    130
  7     250    257    297    306    307
  8     534    543    663    705    706
  9     1122   1136   1473   1591   1592
  10    2333   2356   3202   3543   3543

VII. ALGORITHMS AND THE ENTROPY LANDSCAPE 

In this section we will use different algorithms to find some solutions of the bicoloring
problem close to the SAT/UNSAT transition. We show that a smoothed BP decimation
algorithm is able to find solutions even beyond the rigidity transition, $L > L_r$. We will also
see that, within our level of approximation and for fixed parameters, the algorithm always
finds solutions that belong to the same kind of clusters. Interestingly enough, beyond the
rigidity transition, we find solutions belonging to clusters that are exponentially fewer in number
than the thermodynamically relevant ones.

A. Cavity method as an algorithm to find solutions 

Warning Propagation (WP) is an elementary message passing algorithm that uses cavity
messages to find a solution of a constraint satisfaction problem. On each edge of the factor
graph we define cavity messages $W_{a \to i} \in \{-1, 0, 1\}$ and $W_{i \to a} \in \{-1, 1\}$. The warning $W_{a \to i} = 0$
means that variable $i$ is free to take any value without worrying about constraint $a$. On
the other hand, if $W_{a \to i} = \pm 1$, variable $i$ should take a value that satisfies constraint $a$.
The message $W_{i \to a} = \pm 1$ represents the color that variable $i$ has to take to satisfy the
other constraints. Given a factor graph we start with random initial values of the W's
and in each sweep of the algorithm we update all the messages, see Fig. 7. For example the
messages on edge $(i, a)$ are updated in the following way:

$$ W_{a \to i} = \begin{cases} -1 & \text{if all } W_{j \to a} \text{ are } 1, \\ 1 & \text{if all } W_{j \to a} \text{ are } -1, \\ 0 & \text{otherwise,} \end{cases} \qquad W_{i \to a} = \mathrm{sign}\Big(\sum_{b \in V(i) \setminus a} W_{b \to i}\Big). \qquad (45) $$

FIG. 7: Warning propagation on a factor graph.

If the algorithm converges and no variable receives contradictory warnings we can
determine the solution according to the warnings. It has been shown that on tree factor graphs
the above algorithm always converges and gives the solutions.
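
The warning updates can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it assumes that sign(0) = 0, so that a variable without incoming warnings sends the neutral message 0, and the function names are ours.

```python
import random

def warning(others):
    """W_{a->i} computed from the messages W_{j->a} of the other variables of a."""
    if all(w == 1 for w in others):
        return -1
    if all(w == -1 for w in others):
        return 1
    return 0

def sign(x):
    return (x > 0) - (x < 0)

def warning_propagation(n, edges, max_sweeps=100, seed=0):
    """Iterate the warning updates; returns (converged, function-to-variable warnings)."""
    rng = random.Random(seed)
    nbrs = [[] for _ in range(n)]
    for a, e in enumerate(edges):
        for i in e:
            nbrs[i].append(a)
    w_var = {(i, a): rng.choice((-1, 1)) for a, e in enumerate(edges) for i in e}
    w_fun = {(a, i): 0 for a, e in enumerate(edges) for i in e}
    for _ in range(max_sweeps):
        old = (dict(w_var), dict(w_fun))
        for a, e in enumerate(edges):
            for i in e:
                w_fun[(a, i)] = warning([w_var[(j, a)] for j in e if j != i])
        for (i, a) in w_var:
            # assumption: no net warning gives the neutral message 0
            w_var[(i, a)] = sign(sum(w_fun[(b, i)] for b in nbrs[i] if b != a))
        if (w_var, w_fun) == old:
            return True, w_fun
    return False, w_fun
```

On a tree without pinned variables the iteration converges to all-zero warnings, which means that every variable is still free and a decimation step is needed to select a solution.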

More sophisticated message passing algorithms that work much better than WP are Belief 
Propagation Decimation (BPD) and Survey Propagation Decimation (SPD) 33|. In these 
algorithms one replaces the messages Wa^i, Wi^a with probabilities that come from single 
cluster (RS) or multiple cluster (IRSB) approximation. For example, in BPD we have the 
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believes /ia^i, /ij^a that are updated according to the BP equations 



fii^aicTi) = n /^fe-i(^i)' (46) 

6ey(i)\a 



^ab\i j&V{b)\i 

Starting from random initial values for the /i's we update them to reach a fixed point of the 
dynamics. After convergence we define the local fields 

$$ H_i = \ln\frac{\mu_i(+1)}{\mu_i(-1)}, \tag{47} $$

where 

$$ \mu_i(\sigma_i) \propto \prod_{b\in V(i)} \mu_{b\to i}(\sigma_i). \tag{48} $$

Then the most biased variable is fixed according to the sign of its local field. We then 
simplify the factor graph and repeat the above procedure until we obtain a paramagnet where 
$H_i = 0$ for all the remaining variables. At this stage we can run a local search algorithm to 
complete the solution of our problem. In this paper we are going to use a smoothed version 
of BPD first introduced in 6j. The main idea is to introduce external fields hi that at each 
step of the algorithm are updated according to the local fields. At the end, the external fields 
determine the preferred values of the variables. We call this algorithm Belief Propagation 
Reinforcement 3^ . 

More precisely, the BP Reinforcement algorithm works as follows: 

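• Start with random initial values for $0 < \mu_{i\to a}, \mu_{a\to i} < 1$ and $-\delta < h_i < \delta$ ($\delta \ll 1$). 

• For $t = 1, \ldots, t_{max}$: 

— Update all the $\mu$'s according to the BP equations in presence of the external 
fields: 

$$ \mu_{i\to a}(\sigma_i) \propto e^{h_i\sigma_i} \prod_{b\in V(i)\setminus a} \mu_{b\to i}(\sigma_i), \tag{49} $$

$$ \mu_{b\to i}(\sigma_i) = \sum_{\sigma_{\partial b\setminus i}} I_b(\sigma_{\partial b}) \prod_{j\in V(b)\setminus i} \mu_{j\to b}(\sigma_j). $$

— Obtain the local fields $H_i$ as in Eq. (47) and with probability $1 - t^{-\gamma}$ update the external 
fields as $h_i \leftarrow h_i + \mathrm{sign}(H_i)\,\delta$. 

— If $\underline{\sigma} = \{\sigma_i = \mathrm{sign}(H_i) \mid i = 1, \ldots, N\}$ is a solution, return SOLUTION $= \underline{\sigma}$. 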

Notice that instead of fixing variables one by one, here all the external fields are updated 
(with a rate that increases with t) during the run time. Moreover, in this algorithm we do 
not need to simplify the factor graph after each decimation. 
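The reinforcement loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: variables take values $\pm 1$, the messages are stored as log-ratios, the local field includes the external field, the messages are updated in parallel, and each external field is updated independently with probability $1 - t^{-\gamma}$. The names `bp_reinforcement` and `is_bicoloring` are hypothetical.

```python
import math
import random


def is_bicoloring(constraints, sigma):
    """True if no constraint has all of its variables in the same state."""
    return all(len({sigma[i] for i in c}) > 1 for c in constraints)


def bp_reinforcement(n, constraints, gamma=0.05, delta=0.01, t_max=1000, seed=0):
    """Return a list of +1/-1 values solving the instance, or None."""
    rng = random.Random(seed)
    eps = 1e-12
    edges = [(i, a) for a, c in enumerate(constraints) for i in c]
    nbrs = [[] for _ in range(n)]
    for i, a in edges:
        nbrs[i].append(a)
    p = {e: rng.uniform(0.01, 0.99) for e in edges}   # p[(i, a)] = mu_{i->a}(+1)
    h = [rng.uniform(-delta, delta) for _ in range(n)]

    def log_ratio(b, i):
        """ln[mu_{b->i}(+1) / mu_{b->i}(-1)]; b is violated only if all variables agree."""
        all_plus = all_minus = 1.0
        for j in constraints[b]:
            if j != i:
                all_plus *= p[(j, b)]
                all_minus *= 1.0 - p[(j, b)]
        return math.log(max(1.0 - all_plus, eps) / max(1.0 - all_minus, eps))

    for t in range(1, t_max + 1):
        u = {(i, a): log_ratio(a, i) for i, a in edges}   # parallel update from old p
        # Assumption: the local field H_i includes the external field h_i.
        fields = [2.0 * h[i] + sum(u[(i, a)] for a in nbrs[i]) for i in range(n)]
        for i, a in edges:
            cavity = fields[i] - u[(i, a)]                # leave out constraint a
            p[(i, a)] = 0.5 * (1.0 + math.tanh(0.5 * cavity))
        for i in range(n):
            if fields[i] != 0 and rng.random() < 1.0 - t ** (-gamma):
                h[i] += math.copysign(delta, fields[i])   # reinforce the current bias
        sigma = [1 if f > 0 else -1 for f in fields]
        if is_bicoloring(constraints, sigma):
            return sigma
    return None
```

The default values of `gamma` and `delta` follow the ones quoted in the text; the sketch makes no claim about running times or success rates.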

For comparison, we also use other algorithms like Simulated Annealing (SA) and Focused 
Simulated Annealing (FSA) to find solutions [35]. In the SA algorithm we start from a 
random configuration at small inverse temperature $\beta = 1/T$, and decrease the temperature 
slowly. At each temperature we visit all the variables in a random sequential order and flip 
a variable with probability $\min\{1, \exp(-\beta\Delta E_i)\}$. Here $\Delta E_i$ is the change in the number of 
unsatisfied constraints if we accept the flip of variable $i$. After a sweep the inverse temperature 
increases by $\Delta\beta$. In the FSA algorithm we do the same as in SA except that, to flip a variable, 
we only select those that belong to unsatisfied constraints. 
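A compact sketch of the two annealing schemes for the bicoloring problem is given below. The function names, the default value of `dbeta` and the choice of building the FSA candidate list once per sweep are assumptions of this illustration, not details from the paper.

```python
import math
import random


def violated(c, sigma):
    """A bicoloring constraint is violated when all its variables agree."""
    return len({sigma[i] for i in c}) == 1


def energy(constraints, sigma):
    """Number of unsatisfied constraints."""
    return sum(violated(c, sigma) for c in constraints)


def anneal(n, constraints, beta0=0.1, dbeta=1e-3, max_sweeps=20000,
           focused=False, seed=0):
    """Return a solution as a list of +1/-1 values, or None if none was found."""
    rng = random.Random(seed)
    nbrs = [[] for _ in range(n)]
    for a, c in enumerate(constraints):
        for i in c:
            nbrs[i].append(a)
    sigma = [rng.choice((-1, 1)) for _ in range(n)]
    e = energy(constraints, sigma)
    beta = beta0
    for _ in range(max_sweeps):
        if e == 0:
            return sigma
        if focused:
            # FSA: only variables of currently unsatisfied constraints are tried.
            order = sorted({i for c in constraints if violated(c, sigma) for i in c})
        else:
            order = list(range(n))
        rng.shuffle(order)
        for i in order:
            before = sum(violated(constraints[a], sigma) for a in nbrs[i])
            sigma[i] = -sigma[i]                      # tentative flip
            d_e = sum(violated(constraints[a], sigma) for a in nbrs[i]) - before
            if d_e <= 0 or rng.random() < math.exp(-beta * d_e):
                e += d_e                              # Metropolis acceptance
            else:
                sigma[i] = -sigma[i]                  # reject: undo the flip
        beta += dbeta                                 # cool down after each sweep
    return sigma if e == 0 else None
```

Calling `anneal(..., focused=True)` switches from SA to the focused variant while keeping the same acceptance rule.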

To check the algorithms and their solutions we consider two different cases; (a) (4, 19)- 
hypergraphs {Ld < L < Lg), just after the dynamical transition and before the SAT/UNSAT 
transition where the thermodynamically relevant clusters are unfrozen. (b) (6, 121)- 
hypergraphs ($L_r < L < L_f$), where the thermodynamically relevant clusters are frozen 
but there are still some unfrozen clusters. 

In case (a) we are able to find solutions with all of the BPR, SA and FSA algorithms in a 
reasonable time. On a (4, 19)-hypergraph of $N = 10^4$ variables it takes about 20 hours for 
the SA and FSA algorithms to find a solution whereas the BPR algorithm does the job in about 10 
minutes. In SA and FSA algorithms the parameters are = 0.1 and Aj3 = 10~^. In BPR 
algorithm we used $\gamma = 0.05$ and $\delta = 0.01$. With these parameters we could obtain a solution 
at the end of almost all runs. 

In case (b) we could only find solutions with the BPR algorithm. Both SA and FSA 
algorithms were not able to give a solution in a couple of days even for = 10^. However, 
on a hypergraph of size N = 10002, the BPR algorithm still returns a solution for every 
instance after a few hours in about 20 percent of the runs starting with different initial 
conditions. 

We expect the performance of the algorithm could be improved by further optimization. 
We did not pursue this line as we are interested in a proof of concept rather than in optimizing 
algorithms over academic benchmarks. 
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FIG. 8: s{d) for a few solutions in a (4, 19)-hypergraph with N = 10000 (top) and a (6, 121)- 
hypergraph with N = 10002 (bottom). Note that some of the curves have been selected to 
show the extremal behavior of $s(d)$. 

B. Entropy versus distance from a solution 

Suppose that the number of solutions at distance $d$ from a given solution is $e^{N s(d)}$. 
If the solution belongs to a sphere-like cluster of solutions then $s(d)$ increases for $d < d^*$ 
and becomes zero at $d^*$. Clearly, for large $N$, the entropy $s^*$ is a good (under)estimate of 
the total entropy of the cluster. If the solution space is more complex we still expect $s(d)$ 
to be, up to a distance $d^*$, an increasing function of $d$. It may exhibit a maximum at $d^*$ and 
decrease for larger distances. In any case we can take $s^*$ as an approximation to the entropy 
of the cluster. 

To obtain $s(d)$ for a given solution $\sigma^*$ and distance $d$, we use a local BP as follows: 

• Start with random initial values for $0 < \mu_{i\to a}, \mu_{a\to i} < 1$ and a reasonable value of $x$. 

• For $t = 1, \ldots, t_{max}$: 

— Update all the $\mu$'s according to the BP equations around the given solution: 

$$ \mu_{i\to a}(\sigma_i) = \frac{e^{x(\sigma_i-\sigma_i^*)^2}}{Z_{i\to a}} \prod_{b\in V(i)\setminus a} \mu_{b\to i}(\sigma_i). \tag{50} $$

— Obtain $f(x)$ (Eq. 13) and find the new $x$ such that $d = \partial f(x)/\partial x$. 

— If converged, calculate the entropy (Eq. 15) and return $s(d)$. 

The SP version of this algorithm has been used in 3^ to obtain the complexity as a 
function of distance from a solution in the $k$-XOR-SAT problem. 

Figure 8 shows $s(d)$ for a number of solutions obtained with the BPR algorithm. As the 
figures show, we do not always reach the extremum of the curve $s(d)$. It is even more 
difficult to observe the decreasing part of the entropy. Indeed, as we approach the maximum, 
the convergence time of the algorithm increases rapidly and exceeds the upper bound $t_{max} = 1000$. 
This probably happens when we encounter other clusters, where the replica symmetric 
approximation is no longer valid. However, we could observe the decreasing part of $s(d)$ 
for small values of $N$, where the computational time is not too large. 

Finally, notice that one could obtain the cluster entropy by summing over all solutions: 
$\exp(Ns) = \sum_d \exp(N s(d))$. However, for large $N$ the maximum entropy has the dominant 
contribution to the cluster entropy. To show this we calculated the cluster entropy $s$, using 
the above definition, and compared it with our estimate $s^*$, which is the maximum entropy. 
For instance, for three solutions of a (4, 19)-hypergraph of size $N = 10000$ we obtain $\delta s = 
s - s^* = 0.000115, 0.000118, 0.000121$ whereas $s^* = 0.052, 0.050, 0.0497$, respectively. We see 
that the differences are very small compared to the cluster entropy. 
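The dominance of the maximum in the sum over distances can be checked numerically. The sketch below evaluates $(1/N)\ln\sum_d \exp(N s(d))$ with a log-sum-exp and compares it with $\max_d s(d)$. The curve `toy_entropy` is a hypothetical concave profile chosen only for illustration; it is not one of the measured curves.

```python
import math


def cluster_entropy(n, s_of_d):
    """(1/N) ln sum_d exp(N s(d)) over d = 0, 1/N, ..., 1, using log-sum-exp."""
    values = [n * s_of_d(k / n) for k in range(n + 1)]
    top = max(values)
    return (top + math.log(sum(math.exp(v - top) for v in values))) / n


def toy_entropy(d, s_star=0.05, d_star=0.02, curvature=5.0):
    """Hypothetical concave s(d) peaked at d_star; not a curve from the paper."""
    return s_star - curvature * (d - d_star) ** 2


n = 10000
s_max = max(toy_entropy(k / n) for k in range(n + 1))
delta_s = cluster_entropy(n, toy_entropy) - s_max
```

Since the sum has $N + 1$ terms, each at most $e^{N s^*}$, the difference `delta_s` is bounded by $\ln(N+1)/N$, which is below $10^{-3}$ for $N = 10000$.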
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FIG. 9: Comparing the attractive clusters of different algorithms in the entropy landscape. The 
large circles show the typical and thermodynamically relevant clusters. All FSA, SA and BPR 
algorithms find solutions in the interval between frozen and thermodynamically relevant clusters 
with entropies s{FSA) < s{SA) < s{BPR). In each case we found 20 solutions on a (4, 19)- 
hypergraph of size N = 10000. The standard deviation in the entropies is about 0.002. 

Another source of systematic error is that the curves do not always reach the real 
extremum. However, as Fig. 8 shows, we observed that the maximum entropy is very close 
to the real one. Indeed an extrapolation of the curves to higher distances gives a correction 
of about 0.001. 



C. $m = 1$ vs $m \neq 1$ solutions 



Using the method described in the previous subsection we can now locate our solutions 
in the entropy landscape to see to which clusters they belong. In Fig. 9 we show the 
attractive clusters of different algorithms after the dynamical transition and before the 
rigidity transition. In this case BPR finds solutions in clusters that are very close to the 
thermodynamically relevant ones. We think that the difference is due to the systematic 
errors in underestimating the cluster entropy. In addition, there is also some statistical 
error in the entropy value of the curve points. The figure also shows that SA ends up in 
smaller clusters compared to the thermodynamically relevant ones. Moreover, FSA finds 
solutions in much smaller clusters close to the frozen ones. 

The above results have been obtained with the parameters given in section VII A. We found 
that by decreasing $\gamma$ (in BPR) or $\Delta\beta$ (in SA and FSA) the algorithms find solutions in larger 
clusters. In fact in a very slow annealing scheme, where one equilibrates the system at each 
step of the algorithm, we will finally find a solution in the thermodynamically relevant 
clusters. 

Notice that all the algorithms end up in the region between the frozen and 
thermodynamically relevant clusters. Indeed, when $N$ is large it is very difficult to find a solution in 
the frozen clusters; each time we flip a frozen variable we go to another cluster of solutions 
and so an extensive number of flips is needed to rearrange the variables accordingly. On the 
other hand, it is also not easy to find a solution in very large clusters that are exponentially 
less numerous than the thermodynamically relevant ones. 

As already mentioned, beyond the rigidity transition we could only find a solution with the BPR 
algorithm. Figure 10 shows that in this case the solutions are very close to the boundary 
between frozen and unfrozen clusters. The difference is of the order of the statistical errors and the 
error that we make by underestimating the entropy. We see that when the 
thermodynamically relevant clusters are frozen the algorithm ends up in the smallest unfrozen clusters. 
These are exponentially more numerous than the other unfrozen clusters. 

We have checked that our solutions do, indeed, belong to the unfrozen clusters. This 
can be done with the so called Whitening process [38]. Given a solution one checks if a 
variable can be flipped without violating any constraint. If so, that variable is unfrozen and 
is denoted with $*$. The process goes on by checking one by one the other variables with 
the additional rule that a constraint with at least one star variable is always satisfied. This 
process is repeated up to the fixed point where the number of star variables is fixed. If a 
solution belongs to an unfrozen cluster then at the end all the variables would be $*$. 
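The whitening procedure lends itself to a short sketch. The function below is an illustrative implementation for the bicoloring problem under the rules stated in the text; the name `whitening` and the representation of constraints as tuples of variable indices are assumptions.

```python
def whitening(constraints, sigma):
    """Return the set of star (unfrozen) variables at the fixed point."""
    n = len(sigma)
    nbrs = [[] for _ in range(n)]
    for a, c in enumerate(constraints):
        for i in c:
            nbrs[i].append(a)
    star = set()

    def harmless(i, a):
        """Flipping i cannot violate a: a holds a star, or another variable shares i's value."""
        others = [j for j in constraints[a] if j != i]
        return any(j in star for j in others) or any(sigma[j] == sigma[i] for j in others)

    changed = True
    while changed:                     # repeat until no new star variable appears
        changed = False
        for i in range(n):
            if i not in star and all(harmless(i, a) for a in nbrs[i]):
                star.add(i)
                changed = True
    return star
```

A solution is compatible with an unfrozen cluster when the returned set contains every variable, i.e. `len(whitening(constraints, sigma)) == len(sigma)`.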
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FIG. 10: Comparing the attractive clusters of the BPR algorithm with the typical and thermody- 
namically relevant clusters (large circles). In this case the BPR algorithm finds solutions in the 
most numerous unfrozen clusters. The results were obtained from 20 solutions on a (6, 121)-hypergraph 
of size N = 10002. The standard deviation in the entropy is about 0.002. 

VIII. CONCLUSION 

In summary we applied the large deviations cavity method to study the phase diagram 
of the bicoloring problem on regular random hypergraphs. Working in the one-step replica 
symmetry breaking framework we located the various phase transitions characterizing the 
structure of the solutions landscape at both the ensemble and single instance level. Notice 
that we did not check the stability of 1RSB solutions toward higher order replica symmetry 
breaking. But, as other studies show [39, 40], we expect 1RSB solutions to give the correct 
qualitative picture and even exact results close to the SAT/UNSAT transition. 

We also used different algorithms to find solutions and to locate them in the entropy 
landscape of the problem. This provided a rough characterization of the relations existing 
between the entropic properties of clusters of solutions and the different algorithms used to 
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find them. 

From an algorithmic point of view, the algorithms based on simulated annealing could 
not efficiently find solutions after the rigidity point. However, using BPR we showed that it 
is actually possible to go beyond the rigidity transition. In this case we obtained solutions 
that belong to the smallest and most numerous unfrozen clusters jisl . 
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APPENDIX A: CAVITY EQUATIONS IN THE RS APPROXIMATIONS 

We start from the partition function definition in Eq. M and derive the main equations 
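in the first part of section IV. Let $Z_{i\to a}(\sigma_i)$ denote the partition function in the absence 
of constraint $a$ and when variable $i$ has state $\sigma_i$. Then, in the absence of constraint $a$, the 
probability of finding variable $i$ in state $\sigma_i$ is 

$$ \mu_{i\to a}(\sigma_i) = \frac{Z_{i\to a}(\sigma_i)}{\sum_{\sigma} Z_{i\to a}(\sigma)}. \tag{A-1} $$

On the other hand, assuming a tree structure for the factor graph we can write 

$$ Z_{i\to a}(\sigma_i) = \sum_{\sigma_{\partial i\setminus a}} \Big( \prod_{b\in V(i)\setminus a} I_b(\sigma_{\partial b}) \prod_{j\in V(b)\setminus i} Z_{j\to b}(\sigma_j) \Big) e^{x(\sigma_i-\sigma_i^*)^2}. \tag{A-2} $$

From the above equation we can derive a relation for the cavity marginals 

$$ \mu_{i\to a}(\sigma_i) = \frac{1}{Z_{i\to a}} \sum_{\sigma_{\partial i\setminus a}} \Big( \prod_{b\in V(i)\setminus a} I_b(\sigma_{\partial b}) \Big[ \prod_{j\in V(b)\setminus i} \mu_{j\to b}(\sigma_j) \Big] \Big) e^{x(\sigma_i-\sigma_i^*)^2}, \tag{A-3} $$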

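where $Z_{i\to a}$ is a normalization constant. It is more convenient to write the above 
relation as 

$$ \mu_{i\to a}(\sigma_i) = \frac{e^{x(\sigma_i-\sigma_i^*)^2}}{Z_{i\to a}} \prod_{b\in V(i)\setminus a} \mu_{b\to i}(\sigma_i), \tag{A-4} $$

where 

$$ \mu_{b\to i}(\sigma_i) = \sum_{\sigma_{\partial b\setminus i}} I_b(\sigma_{\partial b}) \prod_{j\in V(b)\setminus i} \mu_{j\to b}(\sigma_j). \tag{A-5} $$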

The free energy $f(x)$ is given by $\ln Z$. In the Bethe approximation 

$$ f(x) = \sum_i \Delta f_i - \sum_a (K_a - 1)\Delta f_a, \tag{A-6} $$

where $\Delta f_i$ and $\Delta f_a$ are the free energy shifts by adding variable node $i$ and function node 
$a$, respectively. 

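Suppose that we have removed node $i$ and all its function nodes from the factor graph. 
In this case the partition function reads 

$$ Z^{-\{i,V(i)\}} = \prod_{a\in V(i)} \Big[ \prod_{j\in V(a)\setminus i} Z_{j\to a} \Big], \tag{A-7} $$

whereas in the complete factor graph 

$$ Z = \sum_{\sigma_i,\sigma_{\partial i}} \prod_{a\in V(i)} \Big[ I_a(\sigma_{\partial a}) \prod_{j\in V(a)\setminus i} Z_{j\to a}(\sigma_j) \Big] e^{x(\sigma_i-\sigma_i^*)^2}. \tag{A-8} $$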

Dividing the two quantities we get 



and this gives the shift in the free energy by adding variable node i 



$$ e^{N\Delta f_i} = \sum_{\sigma_i} \prod_{a\in V(i)} \mu_{a\to i}(\sigma_i)\, e^{x(\sigma_i-\sigma_i^*)^2}. \tag{A-10} $$

We can do the same procedure for a function node. If we remove function node $a$ from 
the factor graph we have 

$$ Z^{-a} = \prod_{i\in V(a)} Z_{i\to a}, \tag{A-11} $$

whereas the complete partition function can be written as 

$$ Z = \sum_{\sigma_{\partial a}} I_a(\sigma_{\partial a}) \prod_{i\in V(a)} Z_{i\to a}(\sigma_i). \tag{A-12} $$

So 
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and the shift in the free energy is given by 

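$$ e^{N\Delta f_a} = \sum_{\sigma_{\partial a}} I_a(\sigma_{\partial a}) \prod_{i\in V(a)} \mu_{i\to a}(\sigma_i). \tag{A-14} $$

Finally using Eqs. A-6, A-10 and A-14 we obtain the free energy $f(x)$ in the Bethe 
approximation 

$$ N f(x) = \sum_i \ln Z_i - \sum_a (K - 1)\ln Z_a, \tag{A-15} $$

$$ Z_a = \sum_{\sigma_{\partial a}} I_a(\sigma_{\partial a}) \prod_{i\in V(a)} \mu_{i\to a}(\sigma_i). $$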

APPENDIX B: CAVITY EQUATIONS IN THE 1RSB APPROXIMATION 

We start from the generalized partition function in Eq. 16 and explain the main equations 
in the second part of section IV. 
In the Bethe approximation 

$$ N\mathcal{F}(m) = \sum_i \Delta\mathcal{F}_i - \sum_a (K - 1)\Delta\mathcal{F}_a. \tag{B-1} $$

The generalized partition function can be written as 

$$ Z = \sum_c e^{mN[s_{c,-\{i,V(i)\}} + \Delta s_{c,i}]}, \tag{B-2} $$

where $\Delta s_{c,i}$ is the shift in the entropy of cluster $c$ by adding node $i$ and all its function 
nodes. At a fixed value of $m$ the typical clusters would have nearly the same entropy and it 
seems safe to approximate $e^{mN\Delta s_{c,i}}$ with its average value among the clusters, i.e. 

$$ Z = \Big[ \sum_c e^{mN s_{c,-\{i,V(i)\}}} \Big] \int d\mathcal{P}[\mu]\, e^{mN\Delta s_i}, \tag{B-3} $$

where $\mathcal{P}[\mu]$ is the probability distribution of the $\mu_{j\to a}$'s among the clusters. So the shift in the 
generalized free energy reads 

aeV{i) j€Via)\i 

In the case of adding a function node, similarly we find 

ieV{a) 

Notice that $\Delta s_i$ and $\Delta s_a$ correspond to the free energy shifts, Eqs. A-10 and A-14, with 
$x = 0$. Using Eqs. B-4 and B-5 along with the Bethe form of the generalized free energy we 
obtain 



/ n n rf7>.->a[/^,^Je-^^-. (B-4) 



J n rfP^^a[/x.-.a]e'"^^^^ (B-5) 



iV^(m) =J2^nZ,- J2{K, - 1) lnZ„ (B-6) 

i a 

Z, = J DiVW 

where $d\mathcal{P}[\mu]$ is determined by Eq. 20. The normalization constant in this equation is 

z.^a = / n n dv^^.w^'''''^ (B-7) 



^niNAsa 



We represented $\mathcal{P}[\mu]$ as 



beV(i)\aj&V(b)\i 



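$$ \mathcal{P}[\mu] = \frac{1-\pi}{2}\,[\delta(r) + \delta(r-1)] + \pi\, p(r), \tag{B-8} $$

where $p(r)$ is the probability distribution of unfrozen marginals. By the normalization and 
symmetry of the problem 

$$ \int dr\, p(r) = 1, \tag{B-9} $$

$$ \int dr\, r\, p(r) = \int dr\, (1-r)\, p(r) = \frac{1}{2}. $$

Then for $\mathcal{Q}^{\sigma}_{i\to a}[\mu] = 2\mu_{i\to a}(\sigma)\mathcal{P}_{i\to a}[\mu]$ we have 

$$ \mathcal{Q}^{1}_{i\to a}[\mu] = (1-\pi)\,\delta(r-1) + 2\pi\, r\, p(r), \tag{B-10} $$

$$ \mathcal{Q}^{0}_{i\to a}[\mu] = (1-\pi)\,\delta(r) + 2\pi\,(1-r)\, p(r). $$

Notice that for $\sigma \in \{0, 1\}$ 

$$ \int d\mu\, \mathcal{Q}^{\sigma}_{i\to a}[\mu] = 1. \tag{B-11} $$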
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APPENDIX C: CALCULATING THE GENERALIZED FREE ENERGY 



To calculate the free energy $\mathcal{F}$ we need to obtain $Z_i$, $Z_a$, the fraction of unfrozen variables 
TT and p(r). Let us start from Eq. [2U] and multiply both sides of the equation with e"''*. 
Integrating over r allows us to get rid of delta function and we find 



^ [l + e~') + Ti I p{r)e-''dr (C-1) 



2 

n [ E I f \C-^)<^<^< n ldrip{ri)] 

ill Ao(n,, n) + n Xiint, r,)]-e-^ AoK,.,)^ 



where $n_b^0$ and $n_b^1$ are the numbers of frozen variables in $V(b)\setminus i$ that take values 0 and 1, 
respectively. Accordingly $V^*(b)\setminus i$ is the set of unfrozen variables in $V(b)\setminus i$ and $n_b^*$ is the 
number of its elements. Notice that the $n_b$'s should satisfy $n_b^0 + n_b^1 + n_b^* = K - 1$. 
Using Eqs. A-3 and A-10 we write 

r = ^ n AoK,r,), (C-2) 



b£V{i) b£V{i) 



where 



j€V{b)\i 

Ai(n6,r6) = (l-(5„i^^_i)[l-5„o^o(l-'^n*,o) H i^-^l)]- 

j&V*ib)\i 

One can use Eq. C-1 to write equations for the different moments of $p(r)$. For example, 
for the second moment we obtain 

/ 



f;' (i^)*-;.": (C-4) 



X n fdrlp{rl)][ n AoKr,)+ U A^K r,)]-(n^M!^)2 

iey*(6)\i 6eV'(j)\a b€Vii)\a 

To compute we also need to find e-'^^*" in Eq. IA-141 



e^^- = (l-5„o,^)(l-5„.,^)[l-5„i,o(l-5n;,o) n ^^-^nO,o(l-^..s,o) n (l-rl)]. (C-5) 
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The normalization constants in Eqs. IA-21 and IB- 71 are 



beV{i)\a 



beV{i)\a 



(C-6) 



and 



2.^a= n [ E 



1* 



n< n dripiri)] (C-7) 



jev*(f')V 

6ey(i)\a bGV{j)\a 



Finally for the main elements of the generalized free energy we have 



= n [ E 



jey*('')V 

x[ n Mnb,rb)+ n ^iK'^^fc)]' 

beV{i)\a beV{i)\a 



and 



E 



1 



■ 1 ~ '""xn'^-L-"! 



(C-8) 



Unl^< n /cir>(O[(l-5n0,i.)(l-5ni,i.)(C-9) 



jey*(a) 



X I 1 - 5„i,o(l - 5„j,o) n - ^n«,0(l - 5n;,0) H " O 

In the following we will give the details of calculations in two special cases that need more 
explanation. 

1. The case $\pi = 1$ 

When $\pi = 1$ the equation for $Z_i$, Eq. C-8, is 



= n [ n / drip{ri)\ [11 f 1 - n I + n f 1 - 11(1 - hj) ; 

a=\,L j=\,K-\-' a \ j J a \ j J 



(C-10) 



For m = 0, 1, 2 we obtain 
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Zi(m = 0) = l, (C-11) 



Zi{m = 1) = 2[1 ^ 1^ 



2K-1 

Z,{m = 2) = 2[1 - + {rT~r + 2[1 - + - (r^)) 



2 ' 2\ft:-iiL , on ^1 /„2\\_fs:~iiL 



For $Z_a$ from Eq. C-9 we find 



i=l,K 

Again for m = 0, 1, 2 



Za{m = 0) = 1, (C-13) 
Z^im = 1) = [1 - 
Za{m = 2) = [1 - ^ + 2{rY + - (r^))^]- 

To complete the free energy calculations we need to find $\langle r^2 \rangle$ in $m = 2$ clusters. The 
second moment of $p(r)$ can be obtained from Eq. C-4 



{r') = ^ n [ n fdrlpirl)] (C-14) 

A^a b=l,L-l j=l,K-l-^ 

xiu fi + n fi - n(i - ri)]r-'[i[ fi - n-^' 

b \ j J b \ j J b \ j 

If m = 2, the exact equation is 



Now we can use the Lagrange interpolating polynomial to approximate the free energy 
by 

$$ \mathcal{F}(m) = \mathcal{F}(m=0)\,\frac{(m-1)(m-2)}{2} - \mathcal{F}(m=1)\, m(m-2) + \mathcal{F}(m=2)\,\frac{m(m-1)}{2}. \tag{C-16} $$

Since $\mathcal{F}(m=0) = 0$ we get 

$$ \mathcal{F}(m) = -\mathcal{F}(m=1)\, m(m-2) + \mathcal{F}(m=2)\,\frac{m(m-1)}{2}, $$

$$ s(m) = -2(m-1)\,\mathcal{F}(m=1) + \Big(m - \frac{1}{2}\Big)\mathcal{F}(m=2), $$

$$ \Sigma(m) = m^2\Big[\mathcal{F}(m=1) - \frac{1}{2}\mathcal{F}(m=2)\Big]. $$
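The interpolation formulas above are easy to verify numerically. The sketch below encodes the quadratic interpolant through $m = 0, 1, 2$ and the derived expressions for $s(m)$ and $\Sigma(m)$, and checks the Legendre relation $\mathcal{F}(m) = \Sigma(m) + m\,s(m)$ for $\mathcal{F}(m=0) = 0$. The function names and the numerical values of $\mathcal{F}(m=1)$, $\mathcal{F}(m=2)$ are arbitrary placeholders, not results from the paper.

```python
def lagrange_free_energy(m, f0, f1, f2):
    """Quadratic Lagrange interpolant through F(0) = f0, F(1) = f1, F(2) = f2."""
    return f0 * (m - 1) * (m - 2) / 2 - f1 * m * (m - 2) + f2 * m * (m - 1) / 2


def entropy_and_complexity(m, f1, f2):
    """s(m) = dF/dm and Sigma(m) = F(m) - m s(m) for the interpolant with f0 = 0."""
    s = -2 * (m - 1) * f1 + (m - 0.5) * f2
    sigma = m * m * (f1 - 0.5 * f2)
    return s, sigma


f1, f2 = 0.3, 0.5          # placeholder values, for illustration only
m = 0.7
s, sigma = entropy_and_complexity(m, f1, f2)
legendre_gap = abs(lagrange_free_energy(m, 0.0, f1, f2) - (sigma + m * s))
```

For any choice of `f1`, `f2` and `m` the quantity `legendre_gap` vanishes up to rounding, which confirms that the three expressions are mutually consistent.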



2. The case of integer m 



Starting from Eq. IB-71 we expand Zi^a for integer m > to get 



n [ E 

X n /cir^p(rDAU^.,r6)Ar'K,r,)]. 



(C-17) 



(C-18) 



To simplify the results we approximate $p(r)$ by $\delta(r - \tfrac{1}{2})$. After some simplifications we 
obtain 



/ 



m . 

[1 -2( 

I r ^2 
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For we use Eq. IC-91 and again p(r) = 6{r — i) to get 
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To obtain the free energy we still need to determine vr. From Eq. [51] we have 
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Taking p(r) = 5(r — |) we obtain an equation for tt and for general m. 
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